DETAILED ACTION

1. This action is in response to the Applicant's arguments filed 29 September 2004.
The arguments with respect to claims 26-34 have been fully considered and are
persuasive. Claims 26-34 remain pending.

REASONS FOR ALLOWANCE 

2. Claims 26-34 are allowed over the prior art of record.
The following is an examiner's statement of reasons for allowance: 

With respect to claim 26, the claimed features "for a current leaf node among the leaf 
nodes of the decision tree, computing a lowest value of a gini index achieved by 
univariate-based partitions on each of a plurality of attribute lists included in the current 
leaf node; and wherein the gini index is equal to 1 - (P_n)^2 - (P_p)^2, P_n being a
percentage of the records of the non-target class in the input data set and P_p being a 
percentage of the records of the target class in the input data set" in conjunction with 
other elements of the independent claims would not be found anticipated or obvious over
the prior art made of record. With respect to claim 27, the claimed features "for a 
current leaf node from among the leaf nodes of the decision tree, computing a lowest 
value of a gini index achieved by univariate-based partitions on each of a plurality of
attribute lists included in the current leaf node; and wherein the percentage of the 
records P_p in the input data set is equal to W_p*n_p/(W_p*n_p + n_n), W_p being a
weight of the records of the target class in the input data set, n_p and n_n being a 
number of the records of the target class and a number of the records of the non-target 
class in the current leaf node, respectively" in conjunction with other elements of the
independent claims would not be found anticipated or obvious over the prior art made of
record. With respect to claim 28, the claimed features "computing a lowest value of a 
gini index achieved by univariate-based partitions on each of a plurality of attribute lists 
included in the current leaf node; and wherein said partitioning step further comprises 
the steps of: detecting subspace clusters of the records of the target class associated 
with the current leaf node; computing the lowest value of the gini index achieved by 
distance-based partitions on each of the plurality of attribute lists, included in the current 
leaf node, the distance based partitions being based on distances to the detected 
subspace clusters; partitioning pre-sorted attribute lists included in the current node into 
two sets of ordered attribute lists based upon a greater one of the lowest value of the 
gini index achieved by univariate partitions and the lowest value of the gini index 
achieved by distance-based partitions; and creating new child nodes for each of the two 
sets of ordered attribute lists; and wherein said detecting step comprises the steps of: 
computing a minimum support (minsup) of each of the subspace clusters that have a 
potential of providing a lower gini index than that provided by the univariate-based 
partitions; identifying one-dimensional clusters of the records of the target class 
associated with the current leaf node; beginning with the one-dimensional clusters, 
combining centroids of K-dimensional clusters to form candidate (K+1)-dimensional
clusters; identifying a number of the records of the target class that fall into each of the
(K+1)-dimensional clusters; pruning any of the (K+1)-dimensional clusters that have a
support lower than the minsup" in conjunction with other elements of the independent 
claims would not be found anticipated or obvious over the prior art made of record. With
respect to claim 32, the claimed features "detecting subspace clusters of the records of 
the target class associated with the current leaf node; computing the lowest value of the 
gini index achieved by distance-based partitions on each of the plurality of attribute lists 
included in the current leaf node, the distance based partitions being based on 
distances to the detected subspace clusters; partitioning pre-sorted attribute lists 
included in the current node into two sets of ordered attribute lists based upon a greater 
one of the lowest value of the gini index achieved by univariate partitions and the lowest 
value of the gini index achieved by distance-based partitions; and creating new child 
nodes for each of the two sets of ordered attribute lists; and wherein said step of 
computing the lowest value of the gini index achieved by distance-based partitions 
comprises the steps of: identifying eligible subspace clusters from among the subspace 
clusters, an eligible subspace cluster having a set of clustered dimensions such that 
only less than all of the clustered dimensions in the set are capable of being included in 
another set of clustered dimensions of another subspace cluster; selecting top-K 
clusters from among the eligible subspace clusters, the top-K clusters being ordered by 
a number of records therein; for each of a current top-K cluster, computing a centroid of 
the current top-K cluster and a weight on each dimension of the current top-K cluster; 
and computing the gini index of the current top-K cluster, based on a weighted 
Euclidean distance to the centroid; and recording a lowest gini index achieved by said 
step of computing the gini index of the current top-K cluster" in conjunction with other 
elements of the independent claims would not be found anticipated or obvious over the
prior art made of record. With respect to claim 33, the claimed features "computing the
lowest value of the gini index achieved by distance-based partitions on each of the 
plurality of attribute lists included in the current leaf node, the distance based partitions 
being based on distances to the detected subspace clusters; partitioning pre-sorted 
attribute lists included in the current node into two sets of ordered attribute lists based 
upon a greater one of the lowest value of the gini index achieved by univariate partitions 
and the lowest value of the gini index achieved by distance-based partitions; and 
creating new child nodes for each of the two sets of ordered attribute lists; and wherein 
each of the plurality of pre-sorted attribute lists comprises a plurality of entries, and said 
step of partitioning the pre-sorted attribute lists comprises the steps of determining 
whether univariate partitioning or distance-based partitioning has occurred; creating a
first hash table that maps record ids of any of the records that satisfy a condition A=v to 
a left child node and that maps the record ids of any of the records that do not satisfy 
the condition A=v to a right child node, A being an attribute and v denoting a splitting 
position, when the univariate partitioning has occurred; creating a second hash table
that maps the record ids of any of the records that satisfy a condition Dist(d, p, w)=v to a 
left child node and that maps the record ids of any of the records that do not satisfy the 
condition Dist(d, p, w)=v to a right child node, when the distance-based partitioning has 
occurred, d being a record associated with a current subspace cluster, p being a 
centroid of the current subspace cluster, and w being a weight on dimensions of the 
current subspace cluster; partitioning the pre-sorted attribute lists into the two sets of 
ordered attribute lists, based on information in a corresponding one of the first hash 
table or the second hash table; appending each entry of the two sets of ordered
attribute lists to one of the left child node or the right child node, based on the 
information in the corresponding one of the first hash table or the second hash table and 
information corresponding to the each entry, to maintain attribute ordering in the two 
sets of ordered attribute lists that corresponds to that in the pre-sorted attribute lists" in
conjunction with other elements of the independent claims would not be found anticipated
or obvious over the prior art made of record. With respect to claim 34, the claimed 
features "wherein said classifying and scoring step comprises the steps of for each of 
the plurality of nodes of the decision tree, starting at the root node, evaluating a Boolean 
condition and following at least one branch of the decision tree until a leaf node is 
reached; classifying the reached leaf node based on a majority class of any of the 
predetermined attributes included therein; for each node in the nearest neighbor set of 
nodes for the reached leaf node, computing a distance between a record to be scored 
and a centroid of the reached leaf node, using a distance function computed for the 
reached leaf node; and scoring the record using a maximum value of a score function, 
the score function defined as conf/dist(d,p,w), wherein the conf is a confidence of the
reached node, d is a particular record associated with a current subspace cluster, p is a 
centroid of the current subspace cluster, and w is a weight on dimensions of the 
subspace cluster" in conjunction with other elements of the independent claims would
not be found anticipated or obvious over the prior art made of record.
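
For illustration only, the gini index recited in claims 26 and 27 above may be sketched as follows. This sketch is not taken from the application or the prior art of record. It assumes that P_n equals n_n/(W_p*n_p + n_n), that the gini index of a two-way partition is the size-weighted average of the gini indices of its two sides, and that an attribute list is a list of (value, is_target) pairs sorted by value. The function names are hypothetical.

```python
def weighted_gini(n_p, n_n, w_p=1.0):
    """Gini index 1 - (P_n)^2 - (P_p)^2 with P_p = W_p*n_p / (W_p*n_p + n_n).

    Assumption: P_n = n_n / (W_p*n_p + n_n), so that P_p + P_n = 1.
    """
    total = w_p * n_p + n_n
    if total == 0:
        return 0.0  # an empty node is treated as pure
    p_p = w_p * n_p / total
    p_n = n_n / total
    return 1.0 - p_n ** 2 - p_p ** 2


def split_gini(left, right, w_p=1.0):
    """Gini index of a two-way partition; left and right are (n_p, n_n) counts.

    Assumption: each side is weighted by its share of the weighted record count.
    """
    size_l = w_p * left[0] + left[1]
    size_r = w_p * right[0] + right[1]
    total = size_l + size_r
    if total == 0:
        return 0.0
    return (size_l * weighted_gini(left[0], left[1], w_p)
            + size_r * weighted_gini(right[0], right[1], w_p)) / total


def lowest_univariate_gini(attribute_list, w_p=1.0):
    """Scan one pre-sorted attribute list and return (lowest gini, split value).

    A split at value v sends records with attribute <= v to the left side.
    """
    total_p = sum(1 for _, is_target in attribute_list if is_target)
    total_n = len(attribute_list) - total_p
    best = (float("inf"), None)
    left_p = left_n = 0
    for i, (value, is_target) in enumerate(attribute_list[:-1]):
        if is_target:
            left_p += 1
        else:
            left_n += 1
        if value == attribute_list[i + 1][0]:
            continue  # no split position between equal attribute values
        g = split_gini((left_p, left_n),
                       (total_p - left_p, total_n - left_n), w_p)
        if g < best[0]:
            best = (g, value)
    return best
```

Under these assumptions, a weight W_p greater than 1 increases the contribution of target-class records to P_p, so a node holding few target-class records is not treated as nearly pure.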

The dependent claims, being definite, further limiting, and fully enabled by the
specification, are also allowed.
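
For illustration only, the score function conf/dist(d,p,w) recited in claim 34 may be sketched as follows. This sketch is not taken from the application. It assumes that dist is the weighted Euclidean distance mentioned in claim 32, that each node in the nearest neighbor set is supplied as a (conf, centroid, weights) tuple, and that a zero distance yields an unbounded score. The function names are hypothetical.

```python
import math


def weighted_euclidean(d, p, w):
    """Weighted Euclidean distance between record d and centroid p.

    Assumption: dist(d, p, w) = sqrt(sum_i w_i * (d_i - p_i)^2).
    """
    return math.sqrt(sum(wi * (di - pi) ** 2 for di, pi, wi in zip(d, p, w)))


def score(record, neighbors):
    """Return the maximum of conf / dist(d, p, w) over the neighbor nodes.

    neighbors is an iterable of (conf, centroid, weights) tuples.
    """
    best = 0.0
    for conf, centroid, weights in neighbors:
        dist = weighted_euclidean(record, centroid, weights)
        # Assumption: a record lying exactly on a centroid scores infinity.
        s = math.inf if dist == 0 else conf / dist
        best = max(best, s)
    return best
```

For example, a record [3, 4] scored against a node with confidence 0.9 and centroid [0, 0] and a node with confidence 0.5 and centroid [3, 3], both with unit weights, takes the second node's score because the record lies much closer to that centroid.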

3. The closest prior art of record, Agrawal et al., U.S. Patent No. 5,799,311, which
relates to data mining, and Ramaswamy et al., "Efficient Algorithms for Mining Outliers
from Large Data Sets," fails to teach the above limitations.

4. Any comments considered necessary by applicant must be submitted no later 
than the payment of the issue fee and, to avoid processing delays, should preferably 
accompany the issue fee. Such submissions should be clearly labeled "Comments on 
Statement of Reasons for Allowance." 

CONTACT INFORMATION



5. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to JEAN B. FLEURANTIN, whose telephone number is
571-272-4035. The examiner can normally be reached from 7:05 to 4:35.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, JOHN E BREENE, can be reached at 571-272-4107. The fax phone
number for the organization where this application or proceeding is assigned is
703-872-9306.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 





Jean Bolte Fleurantin 



October 29, 2004 



